# Self-supervised Pretraining
## Resencl OpenMind SimCLR

A model from the first comprehensive benchmark study of self-supervised learning on 3D medical imaging data.

*3D Vision · AnonRes · 16 downloads · 0 likes*
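SimCLR-style pretraining trains the encoder to pull two augmented views of the same sample together while pushing other samples apart, using the NT-Xent contrastive loss. A minimal NumPy sketch of that loss (illustrative only; the function and parameter names here are not from the benchmark's codebase):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent (normalized temperature-scaled cross-entropy) contrastive loss.

    z1, z2: (N, D) embeddings of two augmented views of the same N samples.
    Illustrative sketch, not the benchmark's actual implementation.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize rows
    sim = z @ z.T / temperature                        # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                     # a view is never its own candidate
    # The positive for view i is the other view of the same sample: i <-> i + N.
    targets = np.concatenate([np.arange(n) + n, np.arange(n)])
    log_denom = np.log(np.exp(sim).sum(axis=1))        # log-sum-exp over all candidates
    loss = -(sim[np.arange(2 * n), targets] - log_denom)
    return loss.mean()
```

Masking the diagonal with `-inf` removes self-similarity from the softmax denominator, so each row is a (2N-1)-way classification whose correct answer is the paired view.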
## Resencl OpenMind VoCo

A model from the first comprehensive benchmark study of self-supervised learning on 3D medical imaging data.

*3D Vision · AnonRes · 16 downloads · 0 likes*
## Hubert Ecg Large

A self-supervised foundation model for broadly scalable cardiac applications, trained on 9.1 million 12-lead ECGs covering 164 cardiovascular diseases.

*Molecular Model · Transformers · Edoardo-BS · 168 downloads · 1 like*
## Berturk Legal

BERTurk-Legal is a Transformer-based language model designed for prior-case retrieval in the Turkish legal domain.

*Large Language Model · Transformers, Other · MIT license · KocLab-Bilkent · 382 downloads · 6 likes*
## Molformer XL Both 10pct

MoLFormer is a chemical language model pretrained on 1.1 billion molecular SMILES strings from ZINC and PubChem. This version was trained on a 10% sample of each dataset.

*Molecular Model · Transformers · Apache-2.0 license · ibm-research · 171.96k downloads · 19 likes*
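Chemical language models like MoLFormer operate on SMILES strings split into atom- and bond-level tokens. A minimal regex-based SMILES tokenizer in that spirit (the pattern and vocabulary here are a common illustrative choice, not MoLFormer's own tokenizer):

```python
import re

# Bracket atoms, two-letter elements, aromatic atoms, bonds, ring closures.
# The exact regex/vocabulary used by MoLFormer may differ.
SMILES_TOKENS = re.compile(
    r"\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9]"
)

def tokenize_smiles(smiles):
    tokens = SMILES_TOKENS.findall(smiles)
    if "".join(tokens) != smiles:  # the tokenizer must cover every character
        raise ValueError(f"untokenizable SMILES: {smiles!r}")
    return tokens
```

For example, aspirin's SMILES `CC(=O)Oc1ccccc1C(=O)O` splits into single-character atom, bond, and ring-closure tokens, while bracket atoms such as `[C@H]` stay as one token.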
## Videomae Small Finetuned Ssv2

VideoMAE is a self-supervised video model based on the Masked Autoencoder (MAE), fine-tuned on the Something-Something V2 dataset for video classification.

*Video Processing · Transformers · MCG-NJU · 140 downloads · 0 likes*
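VideoMAE's MAE-style pretraining hides a very high ratio of video patches (around 90%) using "tube" masking: the same spatial positions are masked in every frame. A minimal sketch of that masking step (illustrative; names and shapes are assumptions, not VideoMAE's code):

```python
import numpy as np

def tube_mask(num_frames, tokens_per_frame, mask_ratio=0.9, rng=None):
    """VideoMAE-style tube masking (illustrative sketch).

    The same spatial patches are hidden in every frame ("tubes"), which
    prevents the model from trivially copying a patch from a nearby frame.
    """
    rng = rng if rng is not None else np.random.default_rng()
    num_masked = int(tokens_per_frame * mask_ratio)
    masked_cols = rng.choice(tokens_per_frame, size=num_masked, replace=False)
    mask = np.zeros((num_frames, tokens_per_frame), dtype=bool)
    mask[:, masked_cols] = True  # repeat the same spatial mask over time
    return mask
```

With a 14x14 patch grid per frame (196 tokens) and a 0.9 ratio, 176 spatial positions are masked in every frame, leaving only 20 visible tubes for the encoder.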
## Regnety 640.seer

RegNetY-64GF feature backbone, pretrained with the self-supervised SEER method on 2 billion random internet images.

*Image Classification · Transformers · Other license · timm · 32 downloads · 0 likes*
## Vit Msn Large

Vision Transformer pretrained with the MSN method; performs well in few-shot scenarios.

*Image Classification · Transformers · Apache-2.0 license · facebook · 48 downloads · 1 like*
## Vit Msn Small

A Vision Transformer pretrained with the MSN method, well suited to few-shot image classification.

*Image Classification · Transformers · Apache-2.0 license · facebook · 3,755 downloads · 1 like*
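MSN (Masked Siamese Networks) trains by matching the soft prototype assignment of a masked view of an image to the sharper assignment of its unmasked view. A minimal NumPy sketch of that objective (illustrative; function names, temperatures, and shapes are assumptions, not the ViT-MSN implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def msn_loss(anchor, target, prototypes, temp=0.1, target_temp=0.025):
    """MSN-style prototype matching (illustrative sketch).

    anchor: (N, D) embeddings of masked views; target: (N, D) embeddings of
    the unmasked views; prototypes: (K, D) learnable cluster centers. The
    loss is the cross-entropy between the anchor's soft prototype assignment
    and the sharper (lower-temperature) target assignment.
    """
    def assign(z, t):
        z = z / np.linalg.norm(z, axis=1, keepdims=True)
        p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
        return softmax(z @ p.T / t, axis=1)  # (N, K) soft assignments

    p_anchor = assign(anchor, temp)
    p_target = assign(target, target_temp)   # sharpened target distribution
    return -(p_target * np.log(p_anchor + 1e-12)).sum(axis=1).mean()
```

Because pretraining clusters images against shared prototypes rather than fitting a task head, the learned features transfer well with only a handful of labels per class, which is why the description highlights few-shot scenarios.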
## Videomae Base Short Ssv2

VideoMAE is a self-supervised video pretraining model based on the Masked Autoencoder (MAE), pretrained for 800 epochs on the Something-Something V2 dataset.

*Video Processing · Transformers · MCG-NJU · 112 downloads · 2 likes*
## Dit Large Finetuned Rvlcdip

A Transformer-based document image classification model, pretrained on IIT-CDIP and fine-tuned on RVL-CDIP.

*Image Classification · Transformers · microsoft · 67 downloads · 8 likes*
## Dit Base Finetuned Rvlcdip

DiT is a Transformer-based document image classification model, pretrained on the IIT-CDIP dataset and fine-tuned on the RVL-CDIP dataset.

*Image Classification · Transformers · microsoft · 31.99k downloads · 30 likes*
## Beit Base Patch16 384

BEiT is a vision Transformer-based image classification model, pretrained in a self-supervised manner on ImageNet-21k and fine-tuned on ImageNet-1k.

*Image Classification · Apache-2.0 license · microsoft · 146 downloads · 5 likes*
## Beit Large Patch16 384

BEiT is a vision Transformer-based image classification model, pretrained in a self-supervised manner on ImageNet-21k and fine-tuned on ImageNet-1k.

*Image Classification · Apache-2.0 license · microsoft · 44 downloads · 0 likes*
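BEiT's self-supervised objective is masked image modeling: block-wise regions of image patches are hidden and the model predicts discrete visual tokens for them. A minimal sketch of the block-wise masking step (illustrative; parameter values and names are assumptions, not BEiT's code):

```python
import numpy as np

def blockwise_mask(grid=14, num_masked=75, min_block=4, rng=None):
    """BEiT-style block-wise masking (illustrative sketch).

    Repeatedly masks rectangular blocks of image patches until at least
    `num_masked` of the grid*grid patches are covered; the final count may
    slightly overshoot, as blocks are allowed to overlap.
    """
    rng = rng if rng is not None else np.random.default_rng()
    mask = np.zeros((grid, grid), dtype=bool)
    while mask.sum() < num_masked:
        remaining = num_masked - int(mask.sum())
        area = int(rng.integers(min_block, remaining + min_block + 1))
        aspect = rng.uniform(0.3, 1 / 0.3)   # random block aspect ratio
        h = max(1, int(round(np.sqrt(area * aspect))))
        w = max(1, int(round(np.sqrt(area / aspect))))
        if h > grid or w > grid:
            continue                          # resample an oversized block
        top = int(rng.integers(0, grid - h + 1))
        left = int(rng.integers(0, grid - w + 1))
        mask[top:top + h, left:left + w] = True
    return mask
```

Masking contiguous blocks rather than independent patches forces the model to reason about larger structures instead of interpolating from immediate neighbors.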
## Wavlm Base Plus

WavLM is a large-scale self-supervised speech model from Microsoft, pretrained on 16 kHz speech audio and applicable to a wide range of speech processing tasks.

*Speech Recognition · Transformers · English · microsoft · 673.32k downloads · 31 likes*